ABSTRACT

Cambarus harti is a state-listed endangered, endemic crayfish found only in three
counties in west-central Georgia. Several studies have attempted to characterize the biology and
ecology of this crayfish; however, data regarding its distribution remain limited. The
International Union for Conservation of Nature (IUCN) stated that, in order to create
an effective conservation plan, the known distribution must be expanded. Species distribution
models are a cost-effective way to identify locations that have habitat characteristics similar to
those with known populations. One species distribution model, Maximum Entropy (MaxEnt), is
the preferred approach when modeling species, like C. harti, that have only a few known
locations. I used MaxEnt to create a predictive spatial model for C. harti. The MaxEnt model
was developed using 14 C. harti occurrence locations and five environmental layers (distance to
water, soils, geology, landcover, and slope) for six counties in west-central Georgia. Using a 2-km
buffer for background points, the model produced a receiver operating characteristic (ROC) curve
with an area under the curve (AUC) value of 0.97. The high AUC value indicates that the model has
high discriminatory power. The five environmental layers were weighted differently; in order of
decreasing importance they were distance to water (35.4%), soil (29.1%), landcover (14.8%),
geology (14.3%), and slope (6.3%). The model's results covered 6110 km² in Georgia with
probabilities of C. harti occurrence ranging from 0% to 100% [(0%-10%) 4432 km², (10%-20%)
622 km², (20%-30%) 371 km², (30%-40%) 214 km², (40%-50%) 150 km², (50%-60%) 137 km², (60%-70%)
122 km², (70%-80%) 30 km², (80%-90%) 30 km², (90%-100%) 2 km²]. The MaxEnt model was
evaluated through two different ground-truthing methods. The first approach examined the
model's overall accuracy by randomly sampling for crayfish at 30 sites across three
model-predicted probability ranges (0%-20%, 40%-60%, 80%-100%). The second method evaluated the
model output at finer resolutions by comparing probabilities of known C. harti locations to sites
within 183 m of known locations but without crayfish. The first approach yielded no verified C.
harti locations within any of the sampling brackets. The second method confirmed that the
model was ineffective at identifying C. harti habitat at the local scale.

Review of the environmental data layers used to create the model uncovered errors in
the underlying data. For example, the USGS National Hydrography Dataset, which was used to create
the distance to water grid, was a large source of error, with many streams improperly mapped. It
is clear that data resolution and accuracy have not advanced to the
point where these models can justifiably be used to map the potential habitat of this endemic
burrowing crayfish. Cambarus harti is likely just one of many species for which this model is
inadequate; models for amphibians and other species that rely on groundwater or surface water
depicted by the USGS National Hydrography Dataset would lack adequate data. A high-resolution
(10 m) groundwater layer is needed in order to model burrowing crayfish habitat more accurately.







ACKNOWLEDGEMENTS 

I would like to thank the members of my committee for all the help and encouragement
they provided along the way. Dr. Troy Keller, thank you for the opportunity to work alongside
you over the years and for providing me with impromptu guidance when I was struggling. Dr.
Chester Figiel, I would like to thank you for sharing with me some of your hands-on experience in
the field and allowing me to use some of your most recently found C. harti locations in my thesis.
I would like to thank Dr. Clifton Ruehl for his guidance on field sampling bias as well as
statistical support. I would also like to give a big thank you to Zach Sertell for coming out into
the field with me and helping me search for this crayfish no matter how hazardous the sampling
location may have been.

Finally, I would like to thank my family for providing me with an unparalleled amount of
support at all hours of the day; no matter how many times I called home, they always picked up.





ACKNOWLEDGEMENTS
LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
METHODS
RESULTS
DISCUSSION
CONCLUSION
LITERATURE CITED
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D
APPENDIX E
APPENDIX F





LIST OF TABLES 

Table 1. Environmental data layers that were used in the Cambarus harti MaxEnt model, including
their scale or resolution, description, and source.

Table 2. The environmental layers and their contribution to the MaxEnt model. The permutation
importance is the layer's correlation with the species while the percent contribution is the value
for which the model decided to weight that layer.

Table 3. An overview of the field validation results from sampling.

Table 4. Two-way ANOVA comparing model probabilities between presence/absence and
sample location (i.e. property).







LIST OF FIGURES 


Figure 1. Known Cambarus harti locations (n=13) relative to the six counties in Georgia used in
the MaxEnt model.

Figure 2. Thirty semi-random locations where sampling of the Cambarus harti occurred. The 0%-
20% probability locations are shown in blue, the 40%-60% in yellow and the 80%-100% in red.
The background is an image from ESRI digital globe with an outline of the six counties used in this
study.

Figure 3. A depiction of the sampling technique used for the field validation of the 30 random
sites. The circle represents the starting point, the heavy dashed lines symbolize the path walked
with a rake to set flags along the outside of the sampling area. The light dotted line was the area
walked searching for burrows.


Figure 4. Four locations used for field validation of known locations, as well as the six labeled
counties.


Figure 5. Effects of sample size on the model's predictive capability (i.e. % of known locations).
The Cambarus harti model scored an AUC (area under the curve) of 0.957. The red line
symbolizes the AUC value for the model while the black line represents a random predicted AUC
value.


Figure 6. Jackknife of regularized AUC training gain for Cambarus harti model for each layer. A)
distance to water, B) geology, C) landcover, D) slope, E) soil, and F) all layers combined. The blue
color symbolizes how the model's AUC (area under the curve) will be affected with only that
variable present and the turquoise shows AUC values with the removal of only that
variable.


Figure 7. Species response for A) distance to water, B) slope, C) soil, D) geology, E) landcover. The
Y axis symbolizes the probability of occurrence and the X axis depicts the environmental
conditions (Appendix D). Negative values shown in A and B are model extrapolations but are not
used in the model.


Figure 8. MaxEnt model predictions for the six counties analyzed in this study. Cool colors (e.g.
blue/white) are areas of low predicted probability of occurrence, while the warmer (e.g. red)
the color, the greater the probability of occurrence.


Figure 9. Average predicted model percent probabilities from 3 sites with presence (n=36) and
absence (n=36) of Cambarus harti. Error bars represent 95% confidence intervals, with a P-value
based on a two-way ANOVA.







Figure 10. Predicted average model percent probabilities from A) Cartwright, B) Chandler, C)
Warm Springs properties with and without Cambarus harti. Nine presence and nine absence
probabilities were taken at each of the 3 sites. Error bars represent 95% confidence intervals. P-
values were obtained from least squares means tests.





Introduction 


Biodiversity, the diversity of genes, species, and habitats within ecosystems, is arguably
the most important driver of ecosystem functions (Zavaleta, 2010). A healthy, diverse ecosystem
provides services such as clean air and water and aids the removal of pollutants
(Schlapfer, 1999). One of the leading threats to biodiversity is the expanding anthropogenic
degradation of the environment. Threats such as sediment loading, pollution, and sprawling
cities destroy natural habitat (Singh, 2002). These anthropogenic effects have increased
species extinction rates around the globe and placed other species in peril (Singh, 2002). Often
overlooked are the less charismatic organisms, such as worms, ants, microbes, and crayfish,
among other macroinvertebrates. These small organisms have been shown to play a critical role
in ecosystems, where they process organic matter and thereby facilitate nutrient cycling (Covich
et al., 1999). Despite their small size, these invertebrates (e.g. crayfish, snails and nymphs) are
critically important for maintaining healthy aquatic ecosystems because they often occur in high
densities (Wallace & Webster, 1996).

On a global scale, freshwater ecosystems have suffered the largest percent loss of
biodiversity, making a clear case for the conservation of freshwater habitats (Richman et al.,
2015). Even though freshwater ecosystems occupy only around 1% of the Earth's surface area,
they support around 10% of all known species (Strayer & Dudgeon, 2010). Among freshwater
species, crayfish are highly imperiled. Nearly 32% of the approximately 590 freshwater crayfish
species worldwide are at risk of extinction (Richman et al., 2015).





In the U.S., only 52% of the 363 identified crayfish species are listed as stable; the other
48% are threatened, endangered or possibly extinct (Taylor et al., 2007). The southeastern
United States is a hotspot of freshwater biodiversity (Master et al., 1998, Georgia DNR - Wildlife
Resources Division, 2017), particularly crayfish diversity. Georgia is home to 68 native crayfishes
and 3 non-native species (Skelton, 2010, Crayfish of U.S., 2017). A third of the 68 native
species are at risk of extinction (Skelton, 2010, Department Of Natural
Resources Division, 2017). Some species are at risk due to their low population sizes and their
limited range (Skelton, 2010, Department Of Natural Resources Division, 2017).

Cambarus harti, the Piedmont Blue Burrower, is a state-listed endangered, endemic
crayfish species with a distribution limited to areas within and near Meriwether County in West
Central Georgia (Keller et al., 2011). Cambarus harti is often found in forested wetland habitats
with shallow groundwater (Keller et al., 2011, Helms et al., 2013, Gilmer, 2014). These scientific
studies are based on a few observations that had relatively small population sizes and a limited
number of locations. Studies of other primary burrowing crayfish suggest that crayfish must be
able to connect with groundwater (Hobbs, 1981, Skelton, 2010, Keller et al., 2011). Crayfish have
gills in their carapace that must be damp in order for them to respire (Tarr, 1884, Hasiotis, 1993,
Skelton et al., 2002, Loughman, 2010). Many primary burrowers live near streams (Tarr, 1884,
Hobbs, 1981), presumably because the clay soils and impermeable geologic layers elevate
groundwater near the ground surface. Hobbs (1981) observed that C. harti was found in
locations where the terrestrial habitats transitioned into flood plains.

Hobbs (1981) originally described this species as a primary burrower that inhabits
underground tunnels connected to the groundwater. This species was commonly found near
springs and seeps (Hobbs, 1981). Cambarus harti excavates complex sets of tunnels. Some
tunnels run horizontally into chambers while others are oriented vertically. Hobbs (1981)
hypothesized that the chambers were developed to provide refuges during fluctuations in the
groundwater. The burrow openings are often marked with chimneys that are typically 10 to
15 cm in height (Helms et al., 2013). Hobbs (1981) also noted that C. harti retreated down to the
deepest chamber when an individual's burrow was being excavated. This behavior made the
retrieval of the species particularly difficult (Hobbs, 1981). Cambarus harti was described as blue
in color. The fourth pair of legs (pereiopods) have simple, acute, hooked ends
(Hobbs, 1981). Hobbs (1981) described the first set of pleopods, leg-like features attached to
the abdomen, as extending to the third set of pereiopods. He also noted that both pleopods meet
flat against each other with acute tips and that there is no sign of a notch on the ventral side
of the pleopod.

The State of Georgia, as well as the IUCN, list C. harti as endangered due to its narrow 
range and small population size (Cordeiro et al., 2010, Skelton, 2010). These evaluations should 
be considered preliminary, because they are based on limited, and in some cases, historical data. 
For example, the IUCN’s information is limited to 2 populations and 16 specimens (Cordeiro et 
al., 2010). In order to determine adequate conservation strategies, the IUCN stresses that 
scientists must locate more populations. Although several C. harti populations have recently
been discovered (Skelton, 2002, Keller et al., 2011), conservation planning for C. harti depends on
the identification of new populations and the collection of additional ecological data. 

The development of species distribution models (SDMs) has provided conservationists and
environmental scientists a suite of new tools that can be used to identify potential habitats for
species, particularly ones with specific niche requirements. SDMs are valuable because they
predict the probability of a species' occurrence across the geographic landscape (Phillips et al.,
2005). Most SDMs predict the likelihood of a species' occurrence based on a relationship
between known locations and user-defined, habitat-related environmental data in the form of
spatial layers such as soils, land cover, and climate (Guisan & Thuiller, 2005). There are two main
model types. Presence-absence models require both occurrence and absence locations, while
presence-only models require known locations only. Although there are a number of different
species distribution model approaches (DOMAIN, MARS, GAM, GBM, GLM, etc.), only
maximum-entropy modeling (MaxEnt) has proven effective for endemic species with only a few known
populations (Wisz et al., 2008).

MaxEnt is a maximum entropy model with thresholds (Wisz et al., 2008) that can predict
a species' distribution based on environmental covariates. The model uses prior data (occurrence
locations and environmental layers) to determine the constraints (i.e. mean, variance) applied to
the model output. Maximum entropy is so named because, among the many different models that
could fulfill the input constraints, the one chosen is the most uniform (i.e. has the highest
entropy). When creating the final model output, MaxEnt starts with a
perfectly uniform probability distribution in geographic space and then applies the constraints,
forcing the model away from this uniform distribution to create the final model (Elith et al., 2011).
MaxEnt can be used to analyze categorical and continuous data types. These forms of data can
be modeled using linear, quadratic, product, threshold, hinge, and binary associations (Elith et
al., 2011). In order to produce an accurate model, MaxEnt uses L1 regularization to reduce
overfitting (Hastie et al., 2009, Elith et al., 2011). This technique is commonly used when
multiple factors (i.e. environmental layers) describe one data point; it softens the distribution,
pushing weight onto more explanatory factors (Hastie et al., 2009, Elith et al., 2011).
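
The interplay between the constraints and the regularization can be illustrated with a minimal sketch of an L1-regularized maximum-entropy fit on a toy one-dimensional transect. This is not the MaxEnt software itself: the function names (`gibbs`, `fit_maxent`), the single regularization constant `beta`, the step size, and the toy data are all assumptions chosen for illustration.

```python
import numpy as np

def gibbs(features, lam):
    """Distribution over cells with q(cell) proportional to exp(features @ lam)."""
    z = features @ lam
    w = np.exp(z - z.max())              # subtract the max for numerical stability
    return w / w.sum()

def fit_maxent(features, presence_idx, beta=0.05, lr=0.5, n_iter=5000):
    """Toy proximal-gradient fit of an L1-penalized maximum-entropy model."""
    emp_mean = features[presence_idx].mean(axis=0)   # feature means at presences
    lam = np.zeros(features.shape[1])                # lam = 0 is the uniform start
    for _ in range(n_iter):
        q = gibbs(features, lam)
        grad = emp_mean - q @ features               # log-likelihood gradient
        lam = lam + lr * grad
        # Soft-thresholding: the L1 step that zeroes weakly supported features
        lam = np.sign(lam) * np.maximum(np.abs(lam) - lr * beta, 0.0)
    return lam, gibbs(features, lam)

# Hypothetical transect of 100 cells with two features scaled to [0, 1]
cells = np.arange(100)
features = np.column_stack([cells / 99.0,    # a "distance to water"-like gradient
                            cells % 2])      # an uninformative alternating layer
presence_idx = np.array([2, 5, 8, 11, 14, 21])   # made-up presence cells

lam, q = fit_maxent(features, presence_idx)
```

In this toy run the weight on the gradient feature becomes negative (probability concentrates at low values), the weight on the alternating layer stays at zero, and the fitted feature means stay within `beta` of the means observed at the presence cells.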

In order to create an effective MaxEnt model one must understand how to create and 
revise the model to remove inaccuracies and adjust parameters to assure a good model fit 
(Phillips et al., 2004, Phillips et al., 2005, Elith et al., 2011). The precision of the model is fully
dependent on the resolution of the spatial data layers. MaxEnt uses prior data collected about
the species’ location and can be used in applications without absence locations. In the past 
MaxEnt has been successfully applied to rare and endemic species such as the endangered dwarf 
wedgemussel and endemic birds in temperate forests of Southern Chile (Wilson et al., 2011, 
Moreno et al., 2011, Campbell & Hilderbrand, 2016). However, to this author’s knowledge, it has 
only been used to model the distribution of two burrowing crayfish species (Rhoden et al., 2017). 

MaxEnt, with its published use on burrowing crayfish (Rhoden et al., 2017), seems the 
appropriate model for C. harti. Effective conservation of C. harti depends on a more thorough 
understanding of this species’ distribution. My goal is to use MaxEnt to develop a spatial model 
that predicts potential habitat and can be used to expand the known distribution of this species. 
An effective SDM would facilitate research about C. harti needed for the development of an 
effective conservation plan. Only with additional data can scientists help protect this endangered 
species. 

Materials and Methods 
Presence data and environmental variables 
Species distribution models rely on spatially explicit data that could be useful for accurately
predicting a species' distribution. The MaxEnt modeling application requires a CSV file containing
the latitude and longitude of known occurrences (i.e. confirmed locations) and an ASCII grid for
each environmental layer (e.g. soils). Using ArcGIS 10.4, all environmental rasters were set to
the same extent (Fig. 1), cell size (10 m), and projection (NAD 1983 Georgia Statewide Lambert)
in order to be included in the MaxEnt model. Cell size was manipulated by converting all layers
to match the finest resolution (10 m). This approach was an attempt to retain all the data instead
of aggregating pixels to a larger pixel size, which would result in a loss of data. The
environmental layers soil, geology, landcover, slope, and distance to water were included in the
model (Table 1). Soil, geology, and landcover were included to describe the burrowing habitat,
while slope, soil, and distance to water could provide indicators of the hydrologic conditions in
the area. All of these layers were hypothesized to play a role in C. harti's habitat requirements.
Soil, geology, and land cover were originally in vector form and were converted to raster using
ArcTools (polygon to raster). The model extent included six counties located in west central
Georgia: Harris, Talbot, Upson, Pike, Meriwether, and Troup (Fig. 1). It is important that the
model's extent is chosen to fit the potential range in which the species could exist; otherwise the
model has a high chance of overfitting inadequate locations (Elith et al., 2011). Across the globe
there may be many areas that have the same environmental conditions required by a species.
However, the range of the species is a key limiting factor that must be accounted for in the model.
The occurrence data were configured in MS Excel™ to the specific structure required by the model.
Data were compiled from research identifying 13 known locations reported in Keller et al. (2011;
Fig. 1).
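
The resampling and export steps described above can be sketched outside ArcGIS as follows. This is an illustration under stated assumptions rather than the workflow used in the study: the function names, the class codes, and the grid origin are hypothetical, and nearest-neighbor replication is assumed as the resampling rule.

```python
import io
import numpy as np

def resample_nearest(grid, factor):
    """Split each cell into factor x factor smaller cells holding the same value."""
    return np.repeat(np.repeat(grid, factor, axis=0), factor, axis=1)

def write_ascii_grid(grid, xll, yll, cellsize, out, nodata=-9999):
    """Write a 2-D array as an ESRI ASCII grid: six header lines, then the rows."""
    nrows, ncols = grid.shape
    out.write(f"ncols {ncols}\nnrows {nrows}\n")
    out.write(f"xllcorner {xll}\nyllcorner {yll}\n")
    out.write(f"cellsize {cellsize}\nNODATA_value {nodata}\n")
    for row in grid:
        out.write(" ".join(str(v) for v in row) + "\n")

# Hypothetical 30 m landcover codes converted to the 10 m model cell size
landcover_30m = np.array([[11, 42], [42, 90]])
landcover_10m = resample_nearest(landcover_30m, 3)

buf = io.StringIO()                      # stands in for a file on disk
write_ascii_grid(landcover_10m, xll=0.0, yll=0.0, cellsize=10, out=buf)
```

Replicating cells in this way matches the grids to a common cell size without discarding values, but the finer grid carries no more information than the coarser source layer.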





Table 1. Environmental data layers that were used in the Cambarus harti MaxEnt model, including
their scale or resolution, description, and source.

Environmental Layer          Description                               Source                 Unit Type  Resolution or Source Scale
Geology                      28 geologic structures, polygon data      Georgia Clearinghouse  Nominal    1:250,000
Soils                        Soil characteristics, polygon data        Georgia Clearinghouse  Nominal    1:250,000
Landcover                    28 landcover types across study site,     USGS                   Nominal    30 m
                             raster data
Slope                        Created from a digital elevation model,   USGS                   Ratio      10 m
                             raster data
Euclidean Distance to Water  Created from the National Hydrography     USGS                   Interval   1:24,000/1:12,000
                             Dataset, depicts distance from water,
                             line and polygon data

Figure 1. Known Cambarus harti locations (n=13) relative to the six counties in Georgia used in the
MaxEnt model.





MaxEnt Analysis 

The C. harti model was developed using MaxEnt software (version 3.3.3k; Phillips et al.,
2011). To help reduce potential sampling bias (Dudik et al., 2005, Phillips, 2008), 10,000 random
background points were collected within a 2-km radius of each known location (Peterman et al.,
2013). The model was set so that the probability output distribution would be logistic, with a
potential maximum probability score equal to 1. MaxEnt was also set to create response curves
for both the continuous (distance to water and slope) and categorical (landcover, geology, and
soil) data. MaxEnt was set to run 5000 iterations of the model and to produce a receiver operating
characteristic (ROC) curve including the area under the curve (AUC) value. The output of the
model was set to ASCII format so that the resulting predictions could be imported into ArcMap.
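
Restricting background points to a buffer around the known locations can be sketched with simple rejection sampling. The coordinates below are hypothetical projected coordinates in meters, and the function name and the reduced number of points are assumptions made only to keep the example small.

```python
import numpy as np

def sample_background(occ_xy, radius_m, n_points, seed=0):
    """Draw n_points uniformly from the union of discs around the occurrences."""
    rng = np.random.default_rng(seed)
    low = occ_xy.min(axis=0) - radius_m          # bounding box of all buffers
    high = occ_xy.max(axis=0) + radius_m
    kept = []
    while len(kept) < n_points:
        cand = rng.uniform(low, high, size=(n_points, 2))
        # Distance from every candidate to every occurrence
        dist = np.linalg.norm(cand[:, None, :] - occ_xy[None, :, :], axis=2)
        kept.extend(cand[dist.min(axis=1) <= radius_m])   # keep points inside a buffer
    return np.array(kept[:n_points])

# Three made-up occurrence locations (x, y in meters)
occ_xy = np.array([[1000.0, 1000.0], [4000.0, 1500.0], [2500.0, 6000.0]])
background = sample_background(occ_xy, radius_m=2000.0, n_points=1000)
```

Drawing the background from the same neighborhoods as the occurrences means the model contrasts occupied cells with nearby available cells rather than with the whole six-county extent.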

Because the model has the potential to include layers that are uninformative, jackknifing
was used to evaluate individual layers to determine their importance in the model. The
jackknifing algorithm runs the model 11 times (once with all five layers, five times with a single
layer removed, and five times with a single layer alone) to
determine the importance each layer has on the overall model. All input layers were retained in
the final model (Table 1). ENMeval and ENMTools, packages in R (R Core Team, 2017), were used
to ensure that MaxEnt did not under- or overfit the known distribution (Phillips & Dudik, 2008).
Model validity was also evaluated by creating a null model calculated from 13 randomly placed
occurrence points (Raes & Steege, 2007) using ENMTools (Warren et al., 2010). The null model
was run in MaxEnt with the same settings as my model, and the AUC values of the two models were
then compared quantitatively.
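
For readers unfamiliar with the AUC, the following sketch computes it directly from its rank-based definition: the probability that a randomly chosen presence cell receives a higher model score than a randomly chosen background cell. The scores are invented for illustration and are not results from this study.

```python
def auc(presence_scores, background_scores):
    """Share of presence/background pairs ranked correctly (ties count one half)."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical scores for a fitted model and for a null model built from random points
model_auc = auc([0.9, 0.7, 0.4], [0.1, 0.5, 0.3, 0.2])
null_auc = auc([0.3, 0.6, 0.2], [0.1, 0.5, 0.3, 0.2])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a fitted model is only informative to the extent that its AUC exceeds that of the null model.
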
Field Validation: Random Locations

Once the model was fully developed, its accuracy was assessed using field surveys for C.
harti. Because the model generated a grid that predicted the probability that C. harti will be
found in each cell, its accuracy could be validated by visiting various pixel locations and searching
for C. harti. Thirty locations were chosen to be sampled and were split into three groups: 10 with
high presence probability (80%-100%), 10 with medium presence probability (40%-60%), and 10
with low presence probability (0%-20%). This technique, modeled after Rhoden et al. (2017),
facilitates comparison of model performance among levels. While every effort was made to
select sampling locations at random, problems with public access required sampling choices to
be semi-random. Locations were removed if the land was developed, if permission was not granted,
or if the terrain was inappropriate (e.g. lake bottom). Thirty semi-random locations were chosen
from a fully random set of 100 locations (Fig. 2).

A field sampling protocol was developed to evaluate model predictions. The area
sampled in the field was matched to the cell size of the output raster (10 m × 10 m). To account
for the patchy distribution of C. harti populations (Hobbs, 1981), the 8 pixels surrounding the
randomly selected sample location were also sampled (Fig. 3). Thus the total sampling area
covered 9 cells and a total of 900 m² at each of the 30 sample locations (270 pixels from the
model). This intensive sampling protocol improved the chance of detecting C. harti and reduced
the potential for false negatives (i.e. missed when present).
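
The sampling-area figures above follow from simple arithmetic, worked through below. The variable names are illustrative only.

```python
CELL_SIZE_M = 10           # raster cell edge length (m)
N_SITES = 30               # number of field sites

# The selected cell plus its 8 neighbors form a 3 x 3 block, listed here as
# (column, row) offsets from the central cell
offsets = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)]

cells_per_site = len(offsets)                          # 9 cells
area_per_site_m2 = cells_per_site * CELL_SIZE_M ** 2   # 9 x 100 m² = 900 m²
total_pixels = N_SITES * cells_per_site                # 30 x 9 = 270 pixels
```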

It was important that field validation follow a consistent and effective protocol when
searching for C. harti. For this study, each location (i.e. 9 cells) was examined for crayfish burrows
for up to 3 hours. To navigate to the sample location, the latitude and longitude were extracted
from the centroid of the sample pixel using ArcMap; a Garmin GPSmap76CSx (<10 m accuracy)
was used in the field to navigate to the centroid of the sample location. At the centroid, a picture
of the location was taken to document the habitat characteristics in that area. From this
point, an open-reel tape measure was used to place 8 flags around the boundary of the
sample site (compass directions: N, NE, E, SE, S, SW, W, NW). In order to ensure that the sampling
matched all of the appropriate pixels from the model, I placed flags 13.71 meters N, S, E, and W of
the centroid and 20.12 meters NE, SE, SW, and NW of it (Fig. 3). While walking these lines, a rake
was used to move debris and leaves aside, exposing soil to aid in the visual search for crayfish
chimneys and burrows. The flagged area was further assessed by walking through the entire
sampling site searching for burrows to ensure equivalent sampling effort was allocated for each
pixel. If no burrows or chimneys were found at the site, sampling ended prior to the three-hour
maximum sampling period. At each site, I recorded the latitude and longitude, noted the time spent
sampling, took digital images with a 12 MP camera and measured any crayfish captured. All
sample locations were treated similarly regardless of their probability ranking. A Chi-square
analysis was run comparing the number of burrows dug at sites in the different probability classes.
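
A Chi-square comparison of this kind can be sketched as a goodness-of-fit test across the three probability classes. The counts below are invented, and the null hypothesis of equal expected counts (reasonable only because each class contained the same number of sites) is an assumption, since the exact test configuration is not specified here. With three classes there are two degrees of freedom, for which the p-value has the closed form exp(-statistic / 2).

```python
import math

def chi_square_equal(counts):
    """Chi-square statistic against equal expected counts in every class."""
    expected = sum(counts) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts)

def p_value_df2(statistic):
    """Upper-tail probability of a chi-square distribution with 2 degrees of freedom."""
    return math.exp(-statistic / 2.0)

# Hypothetical numbers of burrows dug at low, medium, and high probability sites
burrows_dug = [4, 6, 11]
statistic = chi_square_equal(burrows_dug)
p_value = p_value_df2(statistic)
```
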

When a potential burrow (a hole running vertically into the ground) was located, it was
excavated slowly and carefully to ensure that the specimen was not harmed in any way. The hole
was dug out using a shovel until groundwater started to fill the passageway; then a plunger pump
was used to pump silt-filled water into the burrow in an attempt to force the crayfish to crawl
out of the burrow. This approach lowers the already low dissolved oxygen levels and has been
used successfully on two preliminary excavations of C. harti (Keller et al., 2013).








Figure 2. Thirty semi-random locations where sampling of the Cambarus harti occurred. The 0%-20% 
probability locations are shown in blue, the 40%-60% in yellow and the 80%-100% in red. The 
background is an image from ESRI digital globe with an outline of the six counties used in this study. 
Figure 3. A depiction of the sampling technique used for the field validation of the 30 random
sites. The circle represents the starting point, the heavy dashed lines symbolize the path
walked with a rake to set flags along the outside of the sampling area. The light dotted line
was the area walked searching for burrows.
Field Validation: Known Locations 

The second phase of the field validation test compared model-predicted probabilities at
sites where the crayfish existed with the surrounding areas where it was absent. This phase
investigated potential sources of model error at the local scale by determining the accuracy of
the model at known C. harti locations. Four locations were selected where the species was
known to exist (Fig. 4). At each of the sites, I used a Trimble Geo 7X with 1-100 cm accuracy to
navigate to the known location, then searched the ground for potential burrows in an outward
spiral fashion. Once a burrow was located, the latitude and longitude were collected. Because
each pixel in the model was 10 m × 10 m, I made sure that the burrows were at least 10 m apart.
This approach was intended to ensure that points would fall in different model pixels. If a burrow
was less than 10 m away from a previously mapped burrow, the new burrow was not plotted.
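
The spacing rule can be sketched as a greedy filter, paired with a check of which model cell each point falls in. The function names and coordinates are hypothetical, and cells are indexed from an assumed lower-left grid origin. Note that a 10-m spacing alone does not strictly guarantee distinct cells, because the diagonal of a 10 m × 10 m cell is about 14.1 m, so an explicit per-cell check remains useful.

```python
import math

def thin_points(points, min_dist):
    """Keep a point only if it is at least min_dist from every point already kept."""
    kept = []
    for x, y in points:
        if all(math.hypot(x - kx, y - ky) >= min_dist for kx, ky in kept):
            kept.append((x, y))
    return kept

def pixel_index(x, y, xll, yll, cellsize):
    """(column, row) of the cell containing (x, y), counted from the lower-left corner."""
    return (int((x - xll) // cellsize), int((y - yll) // cellsize))

# Made-up burrow coordinates in meters
burrows = [(0.5, 0.5), (3.5, 4.5), (9.5, 9.5), (25.0, 5.0)]
mapped = thin_points(burrows, min_dist=10.0)
cells = [pixel_index(x, y, xll=0.0, yll=0.0, cellsize=10.0) for x, y in mapped]
all_distinct = len(set(cells)) == len(cells)
```

In this made-up example the first and third points are more than 10 m apart yet share a cell, so `all_distinct` is false and one of them would need to be dropped.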

Following the mapping of the known locations, I searched similar habitats without C. harti
by selecting an area within 183 m of the known location. To reduce spatial bias, the absence
locations were chosen to be equally proximal to the closest surface water source. Had these sites
been selected at different distances, their probability distribution would have been biased
because the model is dependent on distance to water. The same outward spiraling search
technique was used to confirm the absence of burrows. To ensure equivalent sampling effort, I
searched an area corresponding to nine different model pixels (752m). This resulted in 72
plotted points, nine presences and nine absences for each of the 4 locations mapped. The latitude
and longitude for the locations were collected and stored using the Trimble Geo 7X (accuracy =
1 m).

Analysis of the field data was conducted using ArcMap 10.4. After entering the latitude 
and longitude for presence and absence sites, the data were edited to include the location name 
(Chandler, Cartwright, FDR Institute, or Warm Springs) and status (presence or absence of C. 
harti) for all 72 points. The MaxEnt model was loaded into ArcMap™ in ASCII form and 
converted to a raster using ArcTools (ASCII to Raster). In order to extract values at each pixel, 
the raster values needed to be integers. The data were first converted from 
a decimal to a percent using the raster calculator in ArcTools. The resulting data included many 
decimal places following the percent, making it impossible for ArcMap™ to produce a raster data 
table with so many unique values. The ArcTools raster calculator was then used to convert the values 
to integers. The map was analyzed visually to ensure that each GPS point fell in a different model 
pixel. The Extract by point tool was used to determine the MaxEnt model value at each GPS sample 
location. These data were exported to MS Excel™ for further analysis. 
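
The raster steps described above (decimal to integer percent, one point per pixel, value extraction) can be illustrated outside ArcMap. The following is a minimal sketch, not the workflow used in this study: the function names, the 3 x 3 grid, and the point coordinates are hypothetical, and it assumes the points and the grid origin share a projected coordinate system in metres.

```python
import math

def to_integer_percent(grid):
    """Convert MaxEnt probabilities (0-1 floats) to whole-number percents."""
    return [[int(round(p * 100)) for p in row] for row in grid]

def pixel_index(x, y, xll, yll, cellsize, nrows):
    """Return (row, col) of the cell containing map coordinate (x, y).

    Row 0 is the northernmost row, as in an ESRI ASCII grid, and
    (xll, yll) is the lower-left corner of the grid.
    """
    col = int(math.floor((x - xll) / cellsize))
    row = nrows - 1 - int(math.floor((y - yll) / cellsize))
    return row, col

def extract_values(points, grid, xll, yll, cellsize):
    """Look up the grid value under each point, one point per pixel."""
    nrows, ncols = len(grid), len(grid[0])
    cells = [pixel_index(x, y, xll, yll, cellsize, nrows) for x, y in points]
    for row, col in cells:
        if not (0 <= row < nrows and 0 <= col < ncols):
            raise ValueError("point falls outside the model grid")
    # Mirrors the visual check in the text: no two points may share a pixel.
    if len(set(cells)) != len(cells):
        raise ValueError("two or more points fall in the same model pixel")
    return [grid[row][col] for row, col in cells]

# Hypothetical 3 x 3 grid of probabilities with 10 m cells, origin at (0, 0).
probabilities = [
    [0.012, 0.458, 0.903],
    [0.000, 0.251, 0.677],
    [0.104, 0.333, 0.049],
]
percent = to_integer_percent(probabilities)
points = [(5.0, 25.0), (15.0, 15.0), (25.0, 5.0)]  # hypothetical, in metres
values = extract_values(points, percent, xll=0.0, yll=0.0, cellsize=10.0)
```

Raising an error when two points share a pixel replaces the visual inspection with an explicit check, which may be useful because points 10 m apart can still fall within one 10 m × 10 m cell when they lie along its diagonal.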

A statistical comparison was conducted to determine if the model predictions (dependent 
variable) varied between presence and absence locations (independent variable #1) as well as among the 
properties (independent variable #2) using a two-way ANOVA. Levene's test for equality of variances 
was used to test the assumption of homoscedasticity. A Tukey post-hoc pairwise 
comparison was used to compare differences between the properties. Each individual 
property's presence and absence values were compared in SPSS using a difference of least squares 
means, comparing within individual properties as well as between properties (IBM Corp, 
2017). The site labeled the FDR Institute was omitted from the ANOVA and least squares means 
analysis, because all of the probability values were equal to 0%. 

Figure 4. Four locations used for field validation of known locations, as well as the six labeled counties 
used as the extent of this study. 


Results 
Presence data 


The presence data (N=14) for the model were based on historical surveys of C. harti (Fig. 
1, Keller et al., 2011). The 14 known locations had slopes that ranged from 0-2% grade and were 
located near surface water (max distance 75 m). These locations were recorded mostly in 
hardwood forests where the underlying geology consisted of mica schist and mica schist/gneiss. 
MaxEnt analysis 

The MaxEnt model scored an overall AUC value of 0.957 (Fig. 5). The model weighted the 
environmental layers as follows: distance to water (35.4%), soil (29.1%), landcover (14.8%), 
geology (14.3%), and slope (6.3%). A jackknife analysis showed that each layer played an 
important role in the overall model AUC (Fig. 6), so all layers were retained; however, landcover 
and slope had the lowest regularized gains. 

The results showed that probability of occurrence had a strong negative relationship with 
distance to water and slope (Fig. 7A, 7B). Probability drops to almost 0 beyond 305 m from 
water (Fig. 7A) and to 0 when slope is greater than 4% (Fig. 7B). 
Generally, the species' known locations were found on mica schist (Fig. 7D) and in hardwood forest 
(Fig. 7E). The model's spatial extent consisted of 6110 km² with probabilities of occurrence 
ranging from 0%-100% [(0%-10%) 4432 km², (10%-20%) 622 km², (20%-30%) 371 km², (30%-40%) 
214 km², (40%-50%) 150 km², (50%-60%) 137 km², (60%-70%) 122 km², (70%-80%) 30 km², (80%-90%) 
30 km², (90%-100%) 2 km²] (Fig. 8). As the probability of occurrence increased, the amount 
of predicted area decreased. Relative to the whole study area, high probability habitats (80%-100%) 
encompassed only 0.5% of the total area. This area is small but consistent with a niche 
species such as C. harti. 
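
As a check on the arithmetic, the bin areas reported above can be tallied to confirm the 6110 km² extent and the 0.5% share of high probability habitat. This is only an illustrative sketch; the variable and function names are hypothetical, and the areas are copied from the text.

```python
# Area (km^2) predicted in each 10% probability bin, as reported in the text.
bin_area_km2 = {
    (0, 10): 4432, (10, 20): 622, (20, 30): 371, (30, 40): 214,
    (40, 50): 150, (50, 60): 137, (60, 70): 122, (70, 80): 30,
    (80, 90): 30, (90, 100): 2,
}
total_km2 = sum(bin_area_km2.values())

def share_at_or_above(lower_pct):
    """Percent of the modeled extent in bins starting at lower_pct or higher."""
    area = sum(a for (low, _), a in bin_area_km2.items() if low >= lower_pct)
    return 100.0 * area / total_km2

high_share = share_at_or_above(80)  # bins 80%-90% and 90%-100%
```

The two highest bins together cover 32 km², which is where the rounded 0.5% figure comes from.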
Sensitivity vs. 1 - Specificity for C. harti 

Figure 5. Effects of sample size on the model's predictive capability (i.e. % of known locations). 
The Cambarus harti model scored an AUC (area under the curve) of 0.957. The red line 
symbolizes an AUC value for the model while the black line represents a random predicted AUC. 

Figure 6. Jackknife of regularized AUC training gain for Cambarus harti model for each layer. A) 
distance to water, B) geology, C) landcover, D) slope, E) soil, and F) all layers combined. The 
blue color symbolizes how the model's AUC (area under the curve) will be affected with only 
that variable present and the turquoise shows AUC values with the removal of only that 
variable. 

Figure 7. Species response for A) distance to water, B) slope, C) soil, D) geology, E) landcover. The 
Y axis symbolizes the probability of occurrence and the X axis depicts the environmental 
conditions (Appendix D). Negative values shown in A and B are model extrapolations but are not 
used in the underlying model. 

Figure 8. MaxEnt model predictions for the six counties analyzed in this study. Cool colors (e.g. 
blue/white) are areas of low predicted probability of occurrence, while the warmer the color (e.g. red), 
the greater the probability of occurrence. 

Table 2. The environmental layers and their contribution to the MaxEnt model. The permutation 
importance is the layer's correlation with the species while the percent contribution is the model 
assigned weight. 

Variable             Percent contribution   Permutation importance 
Distance to Water    35.4                   29.2 
Soil                 29.1                   27.2 
Landcover            14.8                   11.8 
Geology              14.3                   2.8 
Slope                6.3                    29.1 





Field Validation: Random Locations 

Sampling of the three different thresholds (low, medium, and high) resulted in no 
confirmed captures of C. harti. Photos illustrated differences in the herbaceous community 
among the three different thresholds (Appendix E). Wetland plants (e.g. arrowhead plants and 
ferns) were present at many locations with high model-predicted probabilities, whereas 
wetland vegetation became rare as the probability dropped. The average time spent sampling 
locations increased from 40.7 min at low probability sites to 49.3 min at high probability sites 
(Table 3). The extended sampling effort was due to a greater number of potential burrows dug as 
the quality of the habitat increased (Chi-square, P < 0.01, Table 3). 

Table 3. An overview of the field validation results from sampling. 

Sampling Probability   Average Time Spent   Standard Deviation   Potential Burrows Dug 
Low (0%-20%)           40.7 min             3 min                0 
Medium (40%-60%)       44.1 min             6.8 min              2 
High (80%-100%)        49.3 min             10.4 min             10 
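
The text does not state which form of the chi-square test was applied to the burrow counts. The sketch below assumes a goodness-of-fit test against equal expected counts across the three probability classes, which is only one plausible reading; the function name is hypothetical and the counts are taken from Table 3.

```python
import math

# Potential burrows dug in each probability class (Table 3).
burrows_dug = {"low": 0, "medium": 2, "high": 10}

def chi_square_equal_expected(counts):
    """Goodness-of-fit statistic against equal expected counts (assumed test)."""
    total = sum(counts)
    expected = total / len(counts)
    statistic = sum((c - expected) ** 2 / expected for c in counts)
    df = len(counts) - 1
    return statistic, df

statistic, df = chi_square_equal_expected(list(burrows_dug.values()))

# With exactly 2 degrees of freedom the chi-square upper-tail probability
# has the closed form exp(-x / 2), so no statistics library is required.
p_value = math.exp(-statistic / 2) if df == 2 else None
```

Under this assumption the result agrees with the reported P < 0.01, although the expected count of 4 per class is low enough that the chi-square approximation should be read with caution.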


Field Validation: Known Locations 

To assess the model's predictive capacity at local scales, this study compared the 
probability scores at 3 sites with and without C. harti burrows. Counter to expectations, model 
probabilities where crayfish were not observed ranked higher than where they were observed 
(Fig. 9). Absence sites showed 20% higher probability scores (ANOVA, P<0.001, Table 4) than 
sites with C. harti present (Fig. 9). FDR Institute was removed from the ANOVA analysis, because 
all values at that site scored 0% probability. There was a statistical difference among sites 
(ANOVA, P<0.001, Table 4). The largest difference existed between Cartwright and Warm Springs 
(P<0.001) and the smallest between Chandler and Warm Springs (Least Squares Means, P>0.14). 
There was a significant interaction term present in the data (ANOVA, P=0.016, Table 4), because 
there was no significant difference between presence and absence at the Chandler and 
Cartwright locations (Least Squares Means, P>0.57, Fig. 10) while there was a significant difference 
at Warm Springs (Least Squares Means, P<0.001, Fig. 10). 

Table 4. Two-way ANOVA comparing model probabilities between presence/absence and 
sampling location (i.e. property). 

Factor                Sum of Squares   df   Mean Square   F        p 
Presence              2078.2           1    2078.2        23.289   < 0.001 
Property              17662.3          2    8831.13       98.964   < 0.001 
Presence × Property   807.1            2    403.57        4.523    0.016 
Residual              4283.3           48   89.24 
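
The entries in Table 4 can be cross-checked from the sums of squares and degrees of freedom alone. The sketch below is illustrative rather than a reproduction of the SPSS analysis; the names are hypothetical, and the design (3 properties, 2 statuses, 9 points per cell) is taken from the methods.

```python
# (sum of squares, degrees of freedom) for each row of Table 4.
table4 = {
    "Presence": (2078.2, 1),
    "Property": (17662.3, 2),
    "Presence x Property": (807.1, 2),
    "Residual": (4283.3, 48),
}

def mean_square(sum_of_squares, df):
    """Mean square is the sum of squares divided by its degrees of freedom."""
    return sum_of_squares / df

# Residual df for a balanced two-way design: observations minus cells.
n_properties, n_statuses, n_per_cell = 3, 2, 9
n_observations = n_properties * n_statuses * n_per_cell
residual_df = n_observations - n_properties * n_statuses

# Each F ratio is the factor's mean square over the residual mean square.
ms_residual = mean_square(*table4["Residual"])
f_ratios = {
    name: mean_square(ss, df) / ms_residual
    for name, (ss, df) in table4.items()
    if name != "Residual"
}
```

The recomputed F ratios agree with the tabulated values to within rounding, and the residual degrees of freedom (48) match a data set of 54 points once the FDR Institute site is excluded.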
Figure 9. Combined average predicted model percent probabilities from 3 sites with presence 
(n=36) and absence (n=36) of Cambarus harti. Error bars represent 95% confidence intervals. 
The reported P-value is based on a two-way ANOVA. 

Figure 10. Predicted average model percent probabilities from A) Cartwright, B) Chandler, C) 
Warm Springs properties with and without Cambarus harti. Nine presence and nine absence 
probabilities were taken at each of the 3 sites. Error bars represent 95% confidence intervals. 
P-values were obtained from least squares means tests. 

Discussion 


MaxEnt has proven to be a powerful tool for expanding the known distributions of 
endangered species (Williams et al., 2009). This modeling approach was used in the Maryland 
Coastal Plain to predict sites for the endangered dwarf wedgemussel (Campbell & Hilderbrand, 
2016) and endemic birds in temperate forests of Southern Chile (Moreno et al., 2011). MaxEnt 
has been successful when developing SDMs in cases where there are small sample sizes of only 
occurrence data (Hernandez et al., 2006, Wisz et al., 2008). Furthermore, it has already been 
used to model burrowing crayfish, Fallicambarus harpi and Procambarus reimeri in Arkansas 
(Rhoden et al., 2017). Given that C. harti was only known from 14 locations, MaxEnt was the 
appropriate species distribution model for this species. However, the results of this research 
appear to contradict previous publications regarding the effectiveness of MaxEnt for modeling 
rare burrowing crayfish (e.g. Rhoden et al., 2017). 

The first form of ground validation for this model was conducted across thirty random 
locations, covering nearly 270 different model pixels. This intensive sampling resulted in no new 
occurrences of C. harti, not even in the high probability sites. There are several possible 
explanations for these findings. Previous research on C. harti as well as other endangered 
burrowers indicated the potential for detection problems (Hobbs, 1981). Hobbs (1981) reported 
that these crayfish are so limited in their distribution that it would require extensive sampling of 
all microhabitats within one location to confirm their presence. Time constraints 
allowed a total sample area of 83.54 square meters at each of the 30 validation sites: 10 high 
probability (80%-100%), 10 medium probability (40%-60%), and 10 low probability (0%-20%). 
It is possible that crayfish burrows were located in close proximity but went 
undetected during visual surveys. At three sites (11, 18, 30) with probabilities above 50%, 
vegetation such as ferns and broadleaf arrowhead indicated wetland conditions and likely the 
presence of shallow groundwater (Kane et al., 2002). These indicator species are common at 
sites where this species is known to occur (Hobbs, 1981). The concern that a crayfish site existed 
just beyond the sampling locations was confirmed when a newly discovered location 
(32.884913, -84.69916) was reported 180 m upstream from ground truthing site 14 
(Appendix A). Not finding C. harti at any of the sites could indicate potential problems with the 
model fit (Tomarken & Waller, 2003) or the patchy distribution of rare species (Hobbs, 1981). 
SDMs can be inaccurate even when their AUC values are high. MaxEnt 
determined the permutation importance for each layer: distance to water, soil, geology, 
landcover, and slope (Table 2). From this, the model compared how abundant these 
environmental conditions were across the landscape and determined the contribution 
of each layer to the final model (Table 2). The model itself scored an AUC 
value of 0.957. This means that 95.7% of the time a randomly chosen pixel will score lower than 
one where the species is known to occur. This indicates that the habitat at the known locations 
was unique compared to the surrounding areas. Thus endangered endemic species with 
particular habitat requirements would be expected to have high AUCs (Rhoden et al., 2017). 
Researchers have identified problems with AUC values as indicators of model fit. A high AUC 
value could result from the model's limited capacity to estimate the habitat requirements 
(Warren & Seifert, 2011). This limited capacity could result from small sample sizes producing 
a strong but spurious correlation. Additionally, predicted absences and 
pseudo-absences (i.e. background points) can artificially inflate the AUC (Lobo et al., 2008). In this 
study, these potential pitfalls were taken into consideration by selecting background 
points that were within 2 km of each known location, following the recommendations of 
Peterman et al. (2013). 
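
The ranking interpretation of AUC used above can be made concrete with a small calculation. The sketch below computes AUC as the fraction of presence and background pairs in which the presence location scores higher; the function name and all scores are invented for illustration and are not values from the C. harti model.

```python
def pairwise_auc(presence_scores, background_scores):
    """Fraction of (presence, background) pairs ranked correctly.

    A pair is ranked correctly when the presence score is higher.
    Ties count as half a correct ranking.
    """
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical model scores, invented for illustration only.
presence = [0.91, 0.84, 0.77, 0.35]
background = [0.05, 0.10, 0.40, 0.02, 0.30]
auc = pairwise_auc(presence, background)
```

Because the statistic depends only on rank order against the chosen background points, a background sample drawn from clearly unsuitable habitat can raise the AUC without improving predictions at fine scales, which is the concern raised by Lobo et al. (2008).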

Because questions remained unanswered regarding the quality of the model at the local 
scale, a second phase of ground truthing was implemented. This approach examined the model 
on a larger scale (i.e. finer resolution) by comparing model probabilities at 4 locations where 
burrows were known to occur versus nearby sites where no burrows existed. A significant 
difference was found between the model probabilities of these two kinds of areas; surprisingly, 
locations without C. harti outscored the locations with C. harti (Tables 4 and 5). These findings 
showed that the model was not able to accurately predict burrows at a fine resolution. The limited 
sample size for C. harti must be taken into consideration. Studies on SDMs, such 
as the one by Hernandez et al. (2006), indicate that MaxEnt performs the best with limited sample 
sizes. However, at these small sample sizes the prediction success for the model could be as low 
as 20%. Wisz et al. (2008) also reported that MaxEnt performs the best with limited sample sizes; 
however, at low sample sizes there exists a lower chance of prediction success. 

One possible explanation for the model's poor performance is that the known locations 
used for it were not accurate. Site inaccuracies have been a problem for studies that extract 
occurrences from the literature (Newbold, 2010). However, all the sites used in the present study 
had been recently confirmed (Keller et al., 2011). This source of error seems unlikely to explain 
this model's poor performance. 

Another source of error in the model could be traced to the environmental layers used to 
make the model itself. One study addressed the concern about standards for data collection and 
called for a rigorous quality control plan for spatial data (Cayuela et al., 2009). Standards should 
be set so that collection techniques and data reporting forms ensure an adequate level 
of precision. Other inaccuracies can cause the model to make false predictions, particularly when 
there is a lack of data and/or gaps in the existing data (Araujo & Guisan, 2006; Guisan & Thuiller, 
2005). Gaps in data layers and data quality may have contributed to the model inaccuracies 
identified by ground truthing in this study. 

To determine if data layers contributed to the model prediction errors, the layers were 
examined visually in ArcMap. Locations where the species was known to exist were compared 
to each of the environmental layers and cross-referenced with field observations. At all locations, 
surface water from the USGS National Hydrography Dataset (NHD) showed 
inaccuracies (Appendix B, Images 1-4). For example, at the Warm Springs property the NHD is 
missing a spring upwelling that forms a small stream (Appendix B, Image 3). The known C. harti 
location was adjacent to this spring, which was not identified in the NHD. In 
another case, the FDR Institute site scored a probability value of 0% because the environmental 
layers inaccurately depicted the hydrography of the location (Appendix B, Image 4). At that site, 
there are crayfish living on both sides of a small spring-fed stream. According to the NHD, 
this stream does not exist or has not been recorded. These errors suggest that 
the hydrography dataset contained important inaccuracies. Unfortunately, those data sets were 
used to formulate the model. Errors associated with missing or misplaced streams contributed 
to an inaccurate final MaxEnt model. 

In the future, the stream data should be revised and corrected, especially if they will be used 
for another SDM. The model is only as good as the data used to create it. When data sets are 
lacking or inaccurate, the model will be unable to create reliable predictions. 

For burrowing crayfish models, an environmental layer that could compensate for 
inaccuracies in the NHD layer would be a layer depicting shallow groundwater. To 
date, there exists no groundwater layer for this area in Georgia. Considering that C. harti as well 
as other burrowing crayfish rely on groundwater connectivity, a layer like this would help create 
a more accurate species distribution model. Rhoden et al. (2017) also lacked a groundwater layer; 
however, they constructed a simple groundwater indicator. Their ground truth sampling did find 
the target species. However, they failed to properly test their model because they chose to 
sample only in roadside ditches where they knew water was present. By only sampling in ditches, 
the known preferred habitat of their crayfish, they may have biased the ground truthing and their 
estimate of the model's accuracy. In order to accurately ground truth the model, the technique 
would need to be revised to sample the entire model landscape, rather than a selected subset 
of locations more likely to sustain the species. 


Conclusion 


SDMs have proven effective when modeling rare species, including burrowing 
crayfish (Rhoden et al., 2017). The quality of environmental layers and their relevance to the 
species being modeled control the accuracy of the model predictions. In this study the 
predicted distribution consisted of 6110 km² with probabilities of occurrence ranging from 0% to 
100%. The resulting MaxEnt model showed significant inaccuracies in its predictions. One 
identified problem was that the environmental layers used in the model contained errors. For 
example, analysis revealed that the USGS National Hydrography Dataset was missing springs and 
small streams, critical habitat for C. harti. These problems, combined with the limited number of 
known locations, contributed to the model's inaccurate predictions. Furthermore, critical 
environmental layers, such as surficial groundwater, do not exist for this area. These data are 
needed to accurately depict C. harti habitat. Until the data for the environmental layers are more 
fully developed and properly ground truthed, the value of SDMs for modeling rare burrowing 
crayfish will be limited. 
APPENDIX A 

Chester's New Location 

Semi-random sample location 14 is plotted on a map overlay along with a relatively new location 
discovered by Chester Figiel. The newly discovered location was not far from the random sample that 
resulted in a non-presence finding. 

APPENDIX B 

Image 1. The Cartwright property with the NHD hydrology layer imposed on an image (blue line). There 
is a stream to the east of this point (red line); however, the NHD map did not include the smaller creek. 

Image 2. The Chandler property with the NHD hydrology layer imposed on an image (blue line). The 
stream is drawn to the NE of my sample location; however, the stream's true location (red line) runs 
parallel to my sample site before taking a bend and entering the easement. 

Image 3. The Warm Springs property with the NHD hydrology layer imposed on an image (blue line). 
The image depicts a stream to the east of the sample location; however, there is also a spring head (red 
dot) located at the sampling point, and water from it flows towards the stream (red outline) before 
infiltrating into the ground. 

Image 4. The FDR Institute property with the NHD hydrology layer imposed on an image (blue line). 
This image does not depict any water at this location; while sampling the point on the map, a stream was 
evident (red line). The stream ran from up the hill towards my point and then under the road through a 
constructed stream crossing culvert. 

APPENDIX C 
Difference of Least Squares Means 

Presence   Standard Error   DF   T Value   P Value 
           3.1488           48   7.96      <.0001 
           3.1488           48   14.03     <.0001 
           3.1488           48   6.07      <.0001 
Present    2.5710           48   4.83      <.0001 
Present    4.4531           48   1.47      0.6832 
Absent     4.4531           48   5.54      <.0001 
Present    4.4531           48   7.19      <.0001 
Absent     4.4531           48   8.03      <.0001 
Present    4.4531           48   13.27     <.0001 
Absent     4.4531           48   4.07      0.0023 
Present    4.4531           48   5.71      <.0001 
Absent     4.4531           48   6.56      <.0001 
Present    4.4531           48   11.80     <.0001 
Present    4.4531           48   1.65      0.5726 
Absent     4.4531           48   2.50      0.1459 
Present    4.4531           48   7.73      <.0001 
Absent     4.4531           48   0.85      0.9566 
Present    4.4531           48   6.09      <.0001 
Present    4.4531           48   5.24      <.0001 
APPENDIX D 
Response curve tables 

Table 1. Geology Dataset Attributes 

Table 2. Landcover Dataset Attributes 

Landcover Type | Description 
[name not legible] | Open sand, sandbars, mud and some sand dunes - natural environments as well as exposed sand from dredging and other activities. Mainly in coastal areas, but also inland, especially along the banks of reservoirs. 
11 Open Water | Lakes, rivers, ponds, ocean, industrial water and aquaculture. 
Transportation | Roads, railroads, airports and runways. 
Utility Swaths | Open swaths maintained for transmission lines. 
Low Intensity Urban - Nonforested | Low intensity urban areas with little or no tree canopy. 
High Intensity Urban | Commercial/industrial and multi-family residential areas. 
Clearcut - Sparse Vegetation | Recent clearcuts, sparse vegetation and other early successional areas. 
Quarries, Strip Mines | Exposed rock and soil from industrial uses, gravel pits and landfills. 
Parks, Recreation | Cemeteries, playing fields, campus-like institutions, parks and schools. 
Pasture, Hay | Pasture and non-tilled grasses. 
[name not legible] | Row crops, orchards, vineyards, groves and horticultural businesses. 
Forested Urban - Deciduous | Low intensity urban areas containing mainly deciduous trees. 
Forested Urban - Evergreen | Low intensity urban areas containing mainly evergreen trees. 
Forested Urban - Mixed | Low intensity urban areas containing mixed deciduous and evergreen trees. 
Hardwood Forest | Mesic to moderately mesic forests of the lower Piedmont and Coastal Plain. Includes non-wetland floodplain forests of yellow-poplar and sweetgum, ravines of oaks and American beech, and many upland oak-hickory stands. 
Xeric Hardwood | Dry hardwood forests found throughout the state, although most common in the mountain regions, and progressively more rare southward. Includes areas dominated by southern red oak, scarlet oak, post oak and blackjack oak. 
Open Loblolly-Shortleaf Pine | Only mapped in the Piedmont. Includes older, fairly open stands that may be almost savanna-like in appearance. 
Xeric Mixed Pine-Hardwood | Dry mixed forests found throughout the state, although most common in the mountain regions and progressively more rare southward. Includes areas dominated by a mix of pines (most frequently shortleaf or Virginia in the mountains, and shortleaf or longleaf elsewhere) and hardwood species such as southern red oak, scarlet oak, post oak and blackjack oak. 
Mixed Pine-Hardwood | Mesic to moderately dry forests of mixed deciduous and evergreen species found throughout the state at lower elevations. May include areas dominated by sweetgum, yellow-poplar, various oak species and loblolly or shortleaf pine. 
Loblolly-Shortleaf Pine | Found from the upper Coastal Plain northward (rare in the Blue Ridge except at the lowest elevations). Includes many stands heavily managed for silviculture as well as areas regenerating from old field conditions. 
Sandhill | Areas of scrub vegetation on deep, sandy soils on the Coastal Plain, especially near the Fall Line and along larger streams. May be dominated by turkey oak, blackjack oak, live oak, holly and longleaf pine. 
Longleaf Pine | Open, savanna-type stand. Heavily managed plantations would likely be classed with 440 or 441. Most common on the lower Coastal Plain, although found up to the lower Piedmont and historically in the Ridge and Valley. 
Cypress-Gum Swamp | Regularly flooded swamp forests mainly found on the Coastal Plain. May include either riparian or depressional wetlands. Usually dominated by pond or baldcypress and/or tupelo gum. 
Bottomland Hardwood | Less frequently flooded wetland forests found throughout the state, but most common on the Coastal Plain. To the north, may be dominated by sweetgum, elms and red maple. To the south, wetland oaks (water oak, willow oak, overcup oak, swamp chestnut oak), black gum, and even spruce pine become more common. 
[name not legible] | Emergent freshwater wetlands found throughout the state. May be dominated by grasses or sedges. 
[name not legible] | Closed canopy, low stature woody wetland. Found throughout the state, although most common on the Coastal Plain. May be result of clearcutting of wetland forests. Frequently includes willows, alders and red maple. 
[name not legible] | Restricted to the Coastal Plain. Includes forests dominated by bay species, wet pine forests (typically slash or pond pine) or Atlantic white cedar. 





Table 3. Soil Dataset Attributes 

PERM | HYGRP | DRAIN | IFHYDRIC | AFLDFREQ 

[Table values are not recoverable from the source scan.] 

AWC = available water capacity (inches/inch) 

CLAY = clay content of soil (% of soil < 2mm in size) 

KFFACT = soil erodibility f-factor 

OM = organic matter content (% by weight) 

PERM = permeability rates (inches/hour) 

HYGRP = soil index variables (1=well drained to 4=poorly drained) 

DRAIN = soil index variable (1=well drained to 7=poorly drained) 

LL = liquid limit of the soil (%moisture by weight) 

IFHYDRIC = hydric soil indicator (1 if hydric) 

AFLDFREQ = annual flood frequency (1 = frequent (>50% chance), 
2 = occasional (5-50% chance), 3 = rare (<5% chance)) 
APPENDIX E 


This is a compilation of many different site locations. Each set of pictures includes a location map 
depicting the area for reference, followed by an image from the site that was taken at the 
centroid of my sample location. The series of images is split into three different groups based on 
the probability of occurrence at these locations (high, medium, low). The images depict a change in the 
groundcover as you move from high probability to low. Within the higher probabilities you will see 
groundcover that denotes shallow groundwater, while the low probabilities lack any cover, and if there is 
some it does not indicate the presence of groundwater. 

Sample Location 8 

APPENDIX F 
List of locations for all random sampled locations along with their probability ranking. 